A novel prognostic model for malignant patients with Gram-negative bacteremia based on real-world research

Gram-negative bacteremia (GNB) is a common complication in malignant patients. Identifying risk factors and developing a prognostic model for GNB might improve the survival rate. In this observational, real-world study, we retrospectively analyzed the risk factors and outcomes of GNB in malignant patients. Multivariable regression was used to identify risk factors for the incidence of GNB, while Cox regression analysis was performed to identify significant prognostic factors. A prognostic model was constructed based on the Cox regression analysis and presented as a nomogram. ROC curves, calibration plots, and Kaplan–Meier analysis were used to evaluate the model. The study cohort comprised 1004 malignant patients with bloodstream infection (BSI), of whom 65.7% (N = 660) acquired GNB. Multivariate analysis showed that gynecologic cancer, hepatobiliary cancer, and genitourinary cancer were independent risk factors for the incidence of GNB. Cox regression analysis showed that shock, admission to ICU before infection, pulmonary infection, higher lymphocyte counts, and lower platelet counts were independent risk factors for overall survival (OS). OS differed significantly between the two groups classified by the optimal cut-off value (log-rank, p < 0.001). A nomogram was created based on the prognostic model and made freely available on a website. This real-world study focused on malignant patients with GNB and showed that shock, admission to ICU before infection, pulmonary infection, higher lymphocyte counts, and lower platelet counts were associated with death in these patients. A prognostic model was constructed to estimate the risk of mortality, with the aim of reducing the risk of death.


Materials and methods
Study population. Malignant patients who were suspected of having BSI and had blood culture samples submitted from July 2012 to September 2020 at Sichuan Cancer Hospital were retrospectively enrolled in this research. This research included inpatients aged 16 years or older. The pathology of all malignant patients was diagnosed by the pathology department of Sichuan Cancer Hospital based on the diagnostic guidelines. The Centers for Disease Control and Prevention (CDC) definition for nosocomial infections was used as a reference to diagnose BSI and GNB 10 . Clinical conditions, including primary infection sites, comorbidities, therapeutic tools, and laboratory results on the day the blood culture samples were submitted, were reviewed and recorded. The culture results and survival status of all enrolled patients were also recorded. Patients with incomplete information or contaminated blood cultures were excluded.
Study design. The study design flow diagram is shown in Fig. 1. Patients whose blood-culture samples were obtained between July 2012 and September 2020 were collected for the study cohort. The number of malignant patients who underwent blood-culture tests was 18,900 (5.3% of 355,336 visits during this period). The number of patients who had positive blood culture results was 1052. According to the organisms isolated from blood culture, the enrolled patients were separated into a Gram-negative bacilli infection (Gram-negative Bacteremia (+)) group and a non-Gram-negative bacilli infection (Gram-negative Bacteremia (−)) group. Multivariable regression was used to identify risk factors for the incidence of GNB. According to their 30-day survival status, the patients in the Gram-negative Bacteremia (+) group were separated into survivor and nonsurvivor groups. Cox regression analysis was performed to identify significant prognostic factors, and a prognostic model was constructed based on the Cox regression.
Data for the new validation cohort came from patients whose blood-culture samples were obtained between October 2020 and December 2021 at Sichuan Cancer Hospital. The number of malignant patients who underwent blood-culture tests was 3864. Finally, 115 GNB-positive patients were included in the new validation cohort and, according to their 30-day survival status, were likewise separated into survivor and nonsurvivor groups.
Time-dependent ROC curves, calibration plots, and Kaplan-Meier analysis were used to evaluate the model in the testing cohort and the new validation cohort. Finally, a nomogram was created based on the prognostic model and presented on a website.
Laboratory methods. Blood samples were collected before initiation of antibiotic treatment using BD BACTEC Standard Anaerobic and Aerobic medium (Becton Dickinson, Sparks, MD, USA), and were then cultured in BD BACTEC FX automated blood culture systems (Becton Dickinson, Sparks, MD, USA). The Sensititre ARIS 2X (Thermo Fisher Scientific, 81 Wyman Street, Waltham, UK) automatic susceptibility and identification system was used for bacterial identification and drug susceptibility testing. The disc diffusion test (Oxoid Ltd/Thermo Fisher Scientific, UK) was used to supplement antibacterial susceptibility testing. Results were interpreted with reference to the guidelines of the Clinical and Laboratory Standards Institute (2017) 11 . Escherichia coli ATCC 25922 and Pseudomonas aeruginosa ATCC 27853 were used as internal quality controls.
Procalcitonin (PCT) was measured using an automated immunofluorescent assay (Brahms KRYPTOR, Hennigsdorf, Germany). The normal PCT concentration was defined as < 0.10 ng/ml. C-reactive protein (CRP) levels were measured by nephelometry (Goldsite Diagnostics Inc, Shenzhen, China). Routine blood tests were performed on a routine blood analyzer (Mindray Medical International, Shenzhen, China). We used quality controls that were regularly checked by the National Center for Clinical Laboratory. All methods were carried out following relevant guidelines and regulations.
Statistical analysis. Data management, statistical analyses, and figure generation were conducted using R version 4.0.3. Clinical characteristics of the participants were summarized by median and interquartile range for continuous measures and by counts with proportions for categorical features. The training and testing cohorts of GNB patients were selected by the random split-sample method (split ratio: 7:3) 12  www.nature.com/scientificreports/ infection status, primary infection site, and treatment status were compared using the Chi-square test and t test. Analyses of laboratory features were performed by the Kruskal-Wallis test, and the Mann-Whitney test was used for two-group comparisons. Then all clinical information and laboratory features were entered into multivariable logistic regression analyses to select the risk factors for the incidence of GNB-BSI 6,7,13 . Variables associated with GNB at a p-value less than 0.05 were candidates for backward stepwise multivariate analysis with the Akaike information criterion (AIC) to identify independent risk factors. Based on the regression results, the potential risk factors were entered into multivariate analyses to select the best-fit model.
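The random split-sample step described above can be illustrated with a short Python sketch (the study itself used R). The function name, the seed, and the use of integer patient IDs are assumptions made for illustration only. Note that an exact 70/30 split of 660 patients gives 462/198, whereas the reported cohorts were 459/201, so the authors' procedure presumably differed in detail.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2020):
    """Randomly split patient IDs into training and testing lists.

    The seed value is hypothetical; it only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 660 GNB patients in the study cohort, represented here by dummy IDs 0..659.
train_ids, test_ids = split_cohort(range(660))
print(len(train_ids), len(test_ids))
```

Every patient lands in exactly one of the two lists, which is the property the internal testing cohort relies on.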
Multivariable time-to-event analysis was performed using Cox proportional hazards regression models to develop a nomogram using weighted estimators corresponding to each covariate derived from fitted Cox regression coefficients and estimates of variance. Survival curves were depicted using the Kaplan-Meier method and compared using the log-rank test.
Figure 1. Study design. This research began with records of all patients whose blood-culture samples were obtained between July 2012 and September 2020. All participants were separated into training and testing cohorts. A new validation cohort was obtained between October 2020 and December 2021. A prognostic model was constructed using the training cohort and evaluated in the testing and new validation cohorts. Finally, an online nomogram was generated based on the prognostic model.
Ethical approval and consent to participate. The study was approved by the medical ethical committee of Sichuan Cancer Hospital (SCCHEC-02-2022-001), which waived the requirement for informed consent owing to the retrospective design of the study.

Results
Clinical characteristics. A total of 1119 patients were eligible for this study, including 775 (69.3%) with GNB and 344 (30.7%) with other bacteremia. Based on a rule of thumb for sample size 12 , at least 10 outcome events per estimated parameter were needed, and thus the total required sample size was calculated to be at least 430 patients. Therefore, all available data were used to maximize the power and generalizability of the results. There were 660 patients with GNB and 344 patients with other bacteremia in the study cohort; the basic information of the study cohort is presented in Table S1. All the GNB patients in the study cohort were divided into a training cohort (n = 459) and an internal testing cohort (n = 201). Then a new validation cohort including 115 patients with GNB was collected. The most common underlying diseases in these three groups were gynecologic cancer and upper gastrointestinal cancer. The most common primary infection site in the three groups was blood, followed by pulmonary infection and urinary tract infection. The demographics and characteristics of all patients in the different data sets are summarized in Table 1.

Risk factors contributing to the incidence of GNB. As shown in Table S1, in the study cohort, patients with gynecologic cancer, upper gastrointestinal cancer, and hepatobiliary cancer were more likely to develop GNB. Primary infections, such as pulmonary, urinary tract, and soft tissue infections, might also lead to GNB. Admission to ICU before infection, chronic obstructive pulmonary disease, and primary antibiotic exposure were also related to GNB. The analysis of laboratory features showed that a higher PCT level (1.23 ng/ml, IQR 0.36-8.54, P < 0.05) and a lower lymphocyte count (0.42 ×10^9/L, IQR 0.22-0.73, P < 0.05) were related to GNB (Fig. S1).
The multivariate logistic regression analysis ( Fig Table S2.  Table S3. According to the multivariate logistic regression analysis, shock, admission to ICU before infection, pulmonary infection, nasogastric tube, and lower platelet counts (PLT) were related to poor prognosis (Fig. S2).
Prognostic model constructed using Cox-regression. The training and testing cohorts of GNB patients consisted of 459 and 201 cases, respectively. The characteristics of GNB patients in the training and testing cohorts were similar to those in the total cohort (Table 2). A new validation cohort was also collected, and its baseline information is also shown in Table 2. The characteristics of surviving and non-surviving patients in the new validation cohort were similar to those in the training and testing cohorts. Based on multivariate Cox proportional hazards regression analyses, five independent prognostic factors were identified in the training cohort (Fig. 2B): shock (HR = 4.625, P < 0.001), admission to ICU before infection (HR = 3.060, P < 0.001), pulmonary infection (HR = 2.316, P = 0.003), higher lymphocyte counts (HR = 1.512, P = 0.007), and lower PLT counts (HR = 0.993, P < 0.001). The prognostic model was constructed with these five factors. For each outcome, coefficients and hazard ratios (HRs) were calculated, and the coefficients were used to weight each factor of the model. The formula of the model is as follows:
$$h(30\,\text{days}) = h_0(30\,\text{days}) \times \exp(1.1183 \times \text{ICU} + 0.8398 \times \text{pulmonary.infection} + 1.5314 \times \text{shock} + 0.4131 \times \text{lymphocyte} - 0.0072 \times \text{PLT})$$
where h(30 days) represents the 30-day survival probability of malignant patients with GNB; h0(30 days) is a constant; ICU represents admission to ICU before infection; pulmonary.infection indicates that the primary infection before GNB was pulmonary infection; shock indicates that the patient developed shock after GNB; lymphocyte represents the lymphocyte count (×10^9/L); and PLT represents the PLT count (×10^9/L).
Estimation of the prognostic model. According to the survival probabilities calculated by the model, time-dependent ROC analysis was used to evaluate the diagnostic value for death caused by GNB. The AUCs at 7, 15, and 30 days were 0.80, 0.82, and 0.82 in the training cohort, respectively. In the testing cohort, the AUCs at 7, 15, and 30 days were 0.77, 0.78, and 0.82, respectively (Fig. 3A and B). The calibration curves indicated good agreement between the actual observations and the model predictions in both the training cohort (Fig. 3C) and the testing cohort (Fig. 3D).
The optimal cut-off value to discriminate nonsurvivors from survivors was 0.929 according to the 30-day ROC curves. The two cohorts were separated into high-risk and low-risk groups based on the probability of 0.929. Kaplan-Meier analysis was performed to evaluate patients' OS in the two groups. The results showed that patients in the high-risk group had shorter OS (P < 0.001), indicating a significantly unfavorable outcome for high-risk GNB (Fig. 3E and F).
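As a worked illustration of the model formula, the following Python sketch evaluates the exponent (the linear predictor) and the resulting hazard relative to baseline for one patient. The coefficients are those reported in the formula; the 0/1 coding of the binary factors, the dictionary keys, and the example patient are assumptions for illustration. Because the baseline term h0(30 days) is not reported in the text, the sketch does not attempt to reproduce the 0.929 cut-off classification.

```python
import math

# Coefficients (log hazard ratios) as reported in the model formula.
COEFFICIENTS = {
    "ICU": 1.1183,                  # admission to ICU before infection (assumed 0/1)
    "pulmonary_infection": 0.8398,  # primary pulmonary infection (assumed 0/1)
    "shock": 1.5314,                # shock after GNB (assumed 0/1)
    "lymphocyte": 0.4131,           # lymphocyte count, x10^9/L
    "PLT": -0.0072,                 # platelet count, x10^9/L
}


def linear_predictor(patient):
    """Weighted sum of the five factors (the exponent in the formula)."""
    return sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)


def relative_hazard(patient):
    """Hazard relative to the baseline h0, i.e. exp(linear predictor)."""
    return math.exp(linear_predictor(patient))


# Hypothetical patient, not taken from the study data.
patient = {"ICU": 0, "pulmonary_infection": 1, "shock": 1,
           "lymphocyte": 0.5, "PLT": 100}
print(round(linear_predictor(patient), 5), round(relative_hazard(patient), 2))
```

Exponentiating each coefficient on its own recovers the reported hazard ratios (for example, exp(1.5314) is approximately 4.625 for shock), which is a quick consistency check between the formula and the listed HRs.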

In the new validation cohort, the calibration curves also displayed high consistency in the prediction of survival time in GNB (Fig. 4A). The AUCs at 7, 15, and 30 days were 0.91, 0.90, and 0.89 in the new validation cohort (Fig. 4B), which suggested good predictive capability of this model. The new validation cohort was separated into high-risk and low-risk groups based on the probability of 0.929, as before. Kaplan-Meier analysis was performed to evaluate patients' OS in the two groups. The results showed that patients in the high-risk group had shorter OS (P < 0.001) (Fig. 4C).
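The Kaplan-Meier analysis used to compare the high-risk and low-risk groups rests on the product-limit estimator, which can be sketched in plain Python as follows. The function name and the toy follow-up data are hypothetical and are not drawn from the study; the group comparison test itself is not implemented here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times:  follow-up time in days for each patient
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy data: eight hypothetical patients followed for up to 30 days.
toy_times = [3, 5, 5, 8, 12, 30, 30, 30]
toy_events = [1, 1, 0, 1, 0, 0, 0, 0]
for day, prob in kaplan_meier(toy_times, toy_events):
    print(day, round(prob, 3))
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at days on which a death occurred.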
In order to validate this model in malignant patients with suspected GNB, we collected 50 GNB patients (NB), another 50 febrile patients who were confirmed to have Gram-positive bacteremia (PB), and 50 febrile patients who were confirmed to have no bloodstream infection (NonB). The basic information of the 150 individuals is provided in Table S4. The AUCs at 7 days and 30 days were 0.87 and 0.82 in this validation cohort; the sensitivities at 7 days and 30 days were 0.65 and 0.87, and the specificities at 7 days and 30 days were 1.00 and 0.64, respectively. These AUCs in the independent PB and NonB groups also suggested good predictive performance of this model (Table S5).
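For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are computed for a single fixed time point. It is a simplified, static version and not the time-dependent ROC analysis used in the study. It assumes a score for which higher values mean higher risk of death; the function names and toy data are hypothetical.

```python
def sensitivity_specificity(risk_scores, died, cutoff):
    """Classify patients with score >= cutoff as high risk and compare with outcome."""
    pairs = list(zip(risk_scores, died))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)
    fn = sum(1 for s, d in pairs if d and s < cutoff)
    tn = sum(1 for s, d in pairs if not d and s < cutoff)
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(risk_scores, died):
    """Probability that a random nonsurvivor scores higher than a random survivor."""
    pos = [s for s, d in zip(risk_scores, died) if d]
    neg = [s for s, d in zip(risk_scores, died) if not d]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: eight hypothetical patients, three of whom died.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
toy_died = [1, 1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(toy_scores, toy_died, cutoff=0.5)
print(round(sens, 3), round(spec, 3), round(auc(toy_scores, toy_died), 3))
```

Moving the cut-off trades sensitivity against specificity, while the AUC summarizes discrimination across all possible cut-offs.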

Development of a web server presenting the prognostic nomogram.
Based on the prognostic model, a nomogram predicting 30-day survival probabilities in all patients with GNB was generated. An online version of our nomogram (Fig. 5) is available, which could help clinicians and patients access our new model more easily. Predicted survival probabilities over time can be determined by entering clinical and laboratory features, and output figures and tables are also generated by the web server. The website is given in the supplementary file.
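A nomogram of this kind conventionally converts each Cox coefficient into points, scaled so that the predictor with the largest possible effect spans 0 to 100. The Python sketch below illustrates that general convention using the reported coefficients. The predictor ranges are hypothetical placeholders, so the resulting points will not match the published nomogram in Fig. 5 or the web tool.

```python
# Coefficients as reported in the model formula.
COEFFICIENTS = {
    "ICU": 1.1183,
    "pulmonary_infection": 0.8398,
    "shock": 1.5314,
    "lymphocyte": 0.4131,
    "PLT": -0.0072,
}

# Hypothetical (min, max) axis ranges, chosen only for illustration.
RANGES = {
    "ICU": (0.0, 1.0),
    "pulmonary_infection": (0.0, 1.0),
    "shock": (0.0, 1.0),
    "lymphocyte": (0.0, 4.0),
    "PLT": (0.0, 500.0),
}

# Largest effect any single predictor can contribute over its range.
MAX_EFFECT = max(abs(beta) * (RANGES[name][1] - RANGES[name][0])
                 for name, beta in COEFFICIENTS.items())


def nomogram_points(name, value):
    """Points for one predictor; 0 corresponds to its lowest-risk value."""
    beta = COEFFICIENTS[name]
    low, high = RANGES[name]
    reference = low if beta >= 0 else high  # protective factors start at the top
    return 100.0 * beta * (value - reference) / MAX_EFFECT


print(round(nomogram_points("shock", 1), 1), round(nomogram_points("PLT", 100), 1))
```

Summing the points over all five predictors gives a total score that is a rescaled version of the linear predictor, which is what a nomogram's total-points axis maps to a survival probability.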

Discussion
As mortality in malignant patients remains high, bloodstream infection is a common, deadly, and costly complication 14 . Gram-negative bacilli are the most frequent bacterial cause of BSIs in malignant patients 15,16 . However, there is still very little research on the risk factors and outcomes of malignant patients with GNB. With cancers, due to surgery and radiochemotherapy, there might be non-inflammatory fever and changes in some laboratory results. Moreover, most cancers are in a chronic stage, so malignant patients may survive with cancer for several years. A model constructed with information from malignant patients could therefore be more specific for malignant patients. We thus set out to find the risk factors for mortality in these patients and integrated these factors into a prognostic model, which could provide evidence to support clinicians' decisions. In our research, GNB accounted for about 69.3% of all BSI patients with cancers, consistent with previous research, which means that GNB has become an important cause of BSI in patients with cancers 17 . The risk factors for GNB infection in our analysis were also similar to those reported in previous publications 6,18 . Patients with gynecologic cancer, hepatobiliary cancer, and genitourinary cancer were more likely to develop GNB. In gynecologic cancer surgery, the prolonged use of surgical drains is a risk factor for surgical site infection, and fecal carriage of bacteria in gynecologic cancer patients might increase the risk of bloodstream infections 19 . For these patients, active surveillance of gram-negative bacilli has proved to be an effective strategy to limit the occurrence of GNB in hospital. Postoperative mortality and morbidity rates after hepatobiliary-pancreatic surgery remain high, and Enterobacteriaceae are the most common microorganisms isolated from these patients.
For hepatobiliary cancer patients, these findings highlight the importance of safe patient care practices and of preventing infection 20 . PCT and CRP are thought to be associated with BSI. Serum PCT concentrations were higher in patients with GNB than in patients with Gram-positive bacteremia or candidemia 21 . In contrast, CRP proved of no use in predicting bacteremia, which was similar to the finding in our study 22,23 .
The 30-day mortality for GNB was 12.27% in this real-world cohort, which was similar to a prior report 24 . We found that shock, admission to ICU before infection, pulmonary infection, higher lymphocyte counts, and lower PLT counts were independently associated with high mortality in patients with GNB. Sepsis and septic shock frequently accompany GNB, and septic shock can lead to higher mortality. The pathogenesis of septic shock involves many complex cellular and biochemical interactions between leukocytes, platelets, endothelial cells, and the complement system that trigger an inflammatory response leading to multi-organ failure 25 . Organ dysfunction and the attendant complications of treating it lead to a high risk of morbid complications and death 26 . Admission to the ICU in the cancer population was associated with high mortality and did not result in benefit from subsequent cancer treatment 27 . Multidrug-resistant organisms on patients' hands in an ICU setting could be one of the reasons 28 . Most gram-negative bacilli produce necrotizing bronchopneumonia with hemorrhage and abscess formation. Some virulent gram-negative species, such as Klebsiella, lead to necrosis, bacteremia, and shock, with a propensity to infect the pulmonary microvasculature 29 . Lower platelet counts often reflect poor nutrition and immunity, which indicates that such patients have a higher risk of GNB and a poor prognosis 30 . As shown in our study, malignant patients with GNB in the death group had lower platelet counts than patients in the survival group.
As reported, the 180-day mortality rate due to septic shock was higher in cancer patients than in non-cancer patients 2 . APACHE II and SOFA are widely used in the evaluation of infectious shock but are limited by complex calculation and subjective estimation. A prognostic model was therefore required to estimate the risk of death in malignant patients with suspected GNB more objectively and quickly. Nomograms have previously been widely used in the oncology literature to help patients evaluate the risk of disease progression and mortality 31,32 . For that reason, a model to predict 30-day mortality in patients with GNB was constructed from common clinical and laboratory features and presented as an online nomogram. Whether applied to the training cohort, the testing cohort, or the new validation cohort, the model performed well in time-dependent ROC analysis. All the data in this research were real-world data with no intervention from the researchers, which could provide more reliable evidence for this model. The factors used to construct the prognostic model were the results available when the blood culture samples were delivered to the laboratory, so the model could be used when malignant patients are suspected of having GNB and blood culture samples have been taken. Patients who have the high-risk factors for GNB indicated in our research could be evaluated by the model. As the factors are accessible and objective, the risk of death could be evaluated more quickly and reliably using a nomogram or web tool than with APACHE II and SOFA. If a high risk is indicated, clinicians could intervene sooner. Therefore, we hope this tool could help clinicians avoid inappropriate treatment and control, at an earlier stage, the clinical indicators that influence mortality. Combining it with novel molecular and phenotypic rapid tests for identification might show potential for a favorable influence on patients' outcomes 33 .
Early goal-directed therapy provided significant benefits to outcomes in patients with severe GNB 34 . However, the population used in model construction and validation might limit the application of the model. In future work, more malignant patients with suspected BSI will be followed, which could validate the existing model in other populations and provide a larger sample size to construct a new model for wider application.

Conclusion
In conclusion, we have described risk factors for the incidence and mortality of GNB in malignant patients based on real-world data, which could provide an accurate and generalizable assessment of the key risk factors for infection and subsequent patient outcomes. Based on our findings, further research could focus on the risk of morbidity and mortality for specific cancers. Additionally, Sichuan Cancer Hospital, as the Cancer Control and Prevention Center in Sichuan province, provided a large number of malignant patients, which made this model more reliable for malignant patients. However, external validation with a prospective cohort is still required and is planned as future work. To our knowledge, this is the first study to construct a nomogram to predict 30-day mortality in malignant patients with GNB. The model is presented on the website (Supplementary file) and can be used freely by clinicians.